Useful third-party libraries: exercises

Biopython

Can you count the number of sequences in the data/proteome.faa file?
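
Hint: one possible starting point is sketched below. It assumes the file is in FASTA format (suggested by the .faa extension) and uses a small in-memory toy dataset, with made-up sequences, instead of the real file.

```python
from io import StringIO

from Bio import SeqIO

# Toy FASTA data standing in for data/proteome.faa (made-up sequences).
toy_fasta = ">p1\nMKV\n>p2\nMKVLA\n>p3\nMK\n"

# SeqIO.parse yields one SeqRecord per FASTA entry, so counting the
# records it produces gives the number of sequences.
n_sequences = sum(1 for _ in SeqIO.parse(StringIO(toy_fasta), "fasta"))
print(n_sequences)
```

For the exercise itself, passing the file path to SeqIO.parse in place of the StringIO handle should be enough.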


In [ ]:

Can you plot the distribution of protein sizes in the data/proteome.faa file?
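
Hint: a minimal sketch, assuming matplotlib is installed and again using made-up toy sequences rather than the real proteome. The number of bins is an arbitrary choice.

```python
from io import StringIO

import matplotlib.pyplot as plt
from Bio import SeqIO

# Toy FASTA data standing in for data/proteome.faa (made-up sequences).
toy_fasta = ">p1\nMKV\n>p2\nMKVLA\n>p3\nMK\n>p4\nMKVLAGG\n"

# The length of each record's sequence is the protein size in residues.
sizes = [len(record.seq) for record in SeqIO.parse(StringIO(toy_fasta), "fasta")]

fig, ax = plt.subplots()
ax.hist(sizes, bins=5)  # the bin count is an arbitrary choice
ax.set_xlabel("Protein size (residues)")
ax.set_ylabel("Number of proteins")
# In a notebook with inline plotting the figure renders when the cell
# finishes; in a plain script, call plt.show() to display it.
```

With a real proteome, a larger number of bins or a logarithmic x axis may show the shape of the distribution more clearly.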


In [ ]:

Can you count the number of CDS sequences in the data/ecoli.gbk file?
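
Hint: GenBank records carry their annotations as a list of features, each with a type. The sketch below counts features of type "CDS"; the helper name count_cds and the toy record are made up for illustration.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord


def count_cds(records):
    """Count features of type "CDS" across an iterable of SeqRecord objects."""
    return sum(
        1
        for record in records
        for feature in record.features
        if feature.type == "CDS"
    )


# Toy record (made up) with two CDS features and one gene feature.
toy_record = SeqRecord(
    Seq("ATGAAATAGATGCCCTAG"),
    id="toy",
    features=[
        SeqFeature(FeatureLocation(0, 9, strand=1), type="gene"),
        SeqFeature(FeatureLocation(0, 9, strand=1), type="CDS"),
        SeqFeature(FeatureLocation(9, 18, strand=1), type="CDS"),
    ],
)

n_cds = count_cds([toy_record])
print(n_cds)
```

For the exercise, the records would come from SeqIO.parse("data/ecoli.gbk", "genbank") instead of the toy list.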


In [ ]:

Can you compute the average root-to-tip distance in the data/tree.nwk file?
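
Hint: a sketch using Bio.Phylo on a made-up three-tip tree. It assumes the file is in Newick format (suggested by the .nwk extension) and that every branch has a length.

```python
from io import StringIO
from statistics import mean

from Bio import Phylo

# Toy Newick tree standing in for data/tree.nwk (made-up topology and lengths).
toy_newick = "((A:1.0,B:2.0):1.0,C:3.0);"
tree = Phylo.read(StringIO(toy_newick), "newick")

# depths() maps every clade to the sum of branch lengths from the root.
depths = tree.depths()

# Keep only the terminal clades (the tips) and average their depths.
tip_distances = [depths[tip] for tip in tree.get_terminals()]
average_distance = mean(tip_distances)
print(average_distance)
```

Here the tips sit at distances 2, 3 and 3 from the root, so the average is 8/3.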


In [ ]:

Using Biopython's classes, can you construct a dummy annotated nucleotide sequence?
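
Hint: one possible combination of classes is Seq, SeqRecord and SeqFeature. The sequence, identifiers and qualifiers below are all made up for illustration.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# A made-up 18-nucleotide sequence with one coding region at positions 0-15.
record = SeqRecord(
    Seq("ATGAAACCCGGGTAGTTT"),
    id="DUMMY1",
    name="DUMMY1",
    description="dummy annotated sequence",
    annotations={"molecule_type": "DNA"},
)

# Locations use Python-style coordinates: 0-based start, exclusive end.
cds = SeqFeature(
    FeatureLocation(0, 15, strand=1),
    type="CDS",
    qualifiers={"gene": ["dummyA"], "product": ["hypothetical protein"]},
)
record.features.append(cds)

# extract() pulls the annotated region out of the parent sequence.
cds_seq = cds.extract(record.seq)
protein = cds_seq.translate(to_stop=True)
print(cds_seq, protein)
```

Extracting and translating the feature is a quick way to check that the coordinates point at the intended region.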


In [ ]:

Can you query PubMed and extract the abstract of the first result when searching for your favorite protein?


In [ ]:

NetworkX

Can you read the yeast protein interaction network in data/yeast.gml? Can you plot the degree distribution of the proteins contained in the graph?
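
Hint: a sketch of the degree-distribution part, assuming matplotlib is installed. It uses a made-up five-node graph; for the exercise the graph would come from nx.read_gml("data/yeast.gml") instead.

```python
from collections import Counter

import matplotlib.pyplot as plt
import networkx as nx

# Toy graph standing in for the yeast network (made-up nodes and edges).
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")])

# G.degree() yields (node, degree) pairs; keep only the degrees.
degrees = [degree for _, degree in G.degree()]

# Count how many nodes have each degree value.
degree_counts = Counter(degrees)

fig, ax = plt.subplots()
ax.bar(list(degree_counts.keys()), list(degree_counts.values()))
ax.set_xlabel("Degree")
ax.set_ylabel("Number of proteins")
# In a notebook with inline plotting the figure renders when the cell
# finishes; in a plain script, call plt.show() to display it.
```

Interaction networks often have a few highly connected nodes, so logarithmic axes may be worth trying on the real data.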


In [ ]:

Using the same graph as in the previous exercise, can you identify the node with the highest centrality? Pick a centrality measure from the many available in the networkx module.
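
Hint: the centrality functions in networkx return a dictionary that maps each node to its score. The sketch below uses degree centrality, which is only one of several reasonable choices, on a made-up toy graph.

```python
import networkx as nx

# Toy graph standing in for the yeast network (made-up nodes and edges).
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")])

# Degree centrality: a node's degree divided by (number of nodes - 1).
centrality = nx.degree_centrality(G)

# The node with the highest score is the key with the largest value.
top_node = max(centrality, key=centrality.get)
print(top_node, centrality[top_node])
```

Swapping in another measure, for example nx.betweenness_centrality, only changes the line that builds the dictionary.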


In [ ]: